2008 ICFP Programming Contest

Mark your calendars for Friday, July 11, 2008 to Monday, July 14,
2008: the dates for the eleventh annual ICFP Programming Contest.

The ICFP Programming Contest is one of the most advanced and
prestigious programming contests, as well as being a chance to show
off your programming skills, your favorite languages and tools, and
your ability to work as a team. The contest is affiliated with the
International Conference on Functional Programming. Teams consisting
of one or more participants, from any part of the world, using any
programming language, may enter.

The specific task will be announced when the contest begins. In the
meantime, watch the Web site for more information:

http://icfpcontest.org/

Please direct any questions to Tim Sheard at sheard@cs.pdx.edu.

-Tim Chevalier, on behalf of the 2008 contest organizers
(programming language devotees at Portland State University and
the University of Chicago)

Map-reduce-merge: simplified relational data processing on large clusters

Map-reduce-merge: simplified relational data processing on large clusters (freely-accessible slides). Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, D. Stott Parker. 2007 ACM SIGMOD conference.

Map-Reduce is a programming model that enables easy development of scalable parallel applications to process a vast amount of data on large clusters of commodity machines. Through a simple interface with two functions, map and reduce, this model facilitates parallel implementation of many real-world tasks such as data processing jobs for search engines and machine learning.

However, this model does not directly support processing multiple related heterogeneous datasets. While processing relational data is a common need, this limitation causes difficulties and/or inefficiency when Map-Reduce is applied on relational operations like joins.

We improve Map-Reduce into a new model called Map-Reduce-Merge. It adds to Map-Reduce a Merge phase that can efficiently merge data already partitioned and sorted (or hashed) by map and reduce modules. We also demonstrate that this new model can express relational algebra operators as well as implement several join algorithms.

They seem to add a third phase – merge: ((k1, [v1]), (k2, [v2])) → (k3, [v3]) – which combines the outputs of two separate, parallel MapReduce tasks.

This makes it possible to express operations like joins and Cartesian products.
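
To make the shape of the new phase concrete, here is a minimal, purely illustrative sketch in Python (my own toy code, not the paper's actual API): two independent in-memory "MapReduce" passes key employee and department records by department, and a merge step then joins the two reduced outputs on that shared key.

    from collections import defaultdict

    def map_reduce(records, mapper, reducer):
        # Toy in-memory MapReduce: group mapper output by key, then reduce each group.
        groups = defaultdict(list)
        for rec in records:
            for k, v in mapper(rec):
                groups[k].append(v)
        return {k: reducer(k, vs) for k, vs in sorted(groups.items())}

    employees   = [("alice", "sales"), ("bob", "sales"), ("carol", "hr")]
    departments = [("sales", "NYC"), ("hr", "Boston")]

    # Two independent MapReduce passes, both keyed on the department name.
    emp_by_dept  = map_reduce(employees,   lambda r: [(r[1], r[0])], lambda k, vs: vs)
    dept_by_name = map_reduce(departments, lambda r: [(r[0], r[1])], lambda k, vs: vs)

    def merge(left, right):
        # Merge phase: combine two reduced, key-partitioned outputs -- here, an equi-join.
        return {k: (left[k], right[k]) for k in left if k in right}

    print(merge(emp_by_dept, dept_by_name))
    # {'hr': (['carol'], ['Boston']), 'sales': (['alice', 'bob'], ['NYC'])}

A Cartesian product falls out the same way: make the merge step pair every key group from one side with every key group from the other, instead of matching on equal keys.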

Applied Proof Theory: Proof Interpretations and their Use in Mathematics

I mentioned this book in a recent discussion, but I think it might interest members not following that discussion.

Ulrich Kohlenbach presents an applied form of proof theory that has led in recent years to new results in number theory, approximation theory, nonlinear analysis, geodesic geometry and ergodic theory (among others). This applied approach is based on logical transformations (so-called proof interpretations) and concerns the extraction of effective data (such as bounds) from prima facie ineffective proofs as well as new qualitative results such as independence of solutions from certain parameters, generalizations of proofs by elimination of premises.

The book first develops the necessary logical machinery emphasizing novel forms of Gödel's famous functional ('Dialectica') interpretation. It then establishes general logical metatheorems that connect these techniques with concrete mathematics. Finally, two extended case studies (one in approximation theory and one in fixed point theory) show in detail how this machinery can be applied to concrete proofs in different areas of mathematics.
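
A standard illustration of the kind of "effective data" involved (my example, not taken from the book's sample pages): a nondecreasing sequence (a_n) in [0,1] converges, but in general no computable rate of convergence exists. What the functional interpretation extracts instead is a bound for the "metastable" reading of the Cauchy property:

    % Cauchy property (prima facie ineffective):
    \forall \varepsilon > 0 \;\exists N \;\forall m,n \ge N \; |a_m - a_n| \le \varepsilon

    % Metastable (Herbrandized) form, the version proof interpretations target:
    \forall \varepsilon > 0 \;\forall g\colon\mathbb{N}\to\mathbb{N} \;\exists N \;\forall m,n \in [N,\, N+g(N)] \; |a_m - a_n| \le \varepsilon

    % For nondecreasing (a_n) in [0,1] an explicit bound can be read off:
    N \le \tilde{g}^{(\lceil 1/\varepsilon \rceil)}(0), \qquad \tilde{g}(n) := n + g(n)

The second statement is classically equivalent to the first, but unlike the first it admits a uniform, computable bound -- the sort of quantitative information the extraction techniques in the book are after.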

The site includes some sample pages for your reading pleasure. Barely ten lines into the preface, Dana Scott appears, and he is clearly one of us...

Read the preface and share your thoughts!

Types Considered Harmful

Benjamin C. Pierce's presentation slides (in PDF) for his talk on Types Considered Harmful. The talk starts out discussing some of the general advantages and disadvantages of static typing, but it centers on the problems of building a type checker for the Boomerang programming language (an offshoot of Harmony).

  • Boomerang language design as an example of
    1. the need for very precise types
    2. some of the technical problems they raise
  • Contracts as an attractive way of addressing some of these issues

Pierce's work is currently centered on creating a PL for bidirectional programming. It's a work in progress, but it's interesting to see the thought process behind language design in real time.
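
For readers who haven't followed the Harmony/Boomerang work, the central abstraction in bidirectional programming is the lens: a pair of functions, get (extract a view from a source) and put (push an edited view back into the source), subject to round-trip laws. Here is a deliberately tiny, hypothetical Python sketch of the idea -- nothing like Boomerang's actual syntax or type system, just the shape of the abstraction:

    class Lens:
        # A lens: get extracts a view from a source; put writes an updated view back.
        # Round-trip laws: put(s, get(s)) == s  and  get(put(s, v)) == v.
        def __init__(self, get, put):
            self.get = get
            self.put = put

    # A lens focusing on one field of a record (a dict).
    def field(name):
        return Lens(
            get=lambda src: src[name],
            put=lambda src, view: {**src, name: view},
        )

    phone = field("phone")
    contact = {"name": "Ada", "phone": "555-0100"}

    view = phone.get(contact)                   # '555-0100'
    updated = phone.put(contact, "555-0199")    # edit the view, push it back
    assert phone.get(updated) == "555-0199"     # PutGet law
    assert phone.put(contact, view) == contact  # GetPut law

The "very precise types" in the slides are what it takes to guarantee laws like these statically for real lenses; the sketch above only states them as runtime assertions.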

Computational Thinking

Computational thinking is a fundamental skill for everyone, not just for computer scientists. To reading, writing, and arithmetic, we should add computational thinking to every child’s analytical ability. Just as the printing press facilitated the spread of the three Rs, what is appropriately incestuous about this vision is that computing and computers facilitate the spread of computational thinking.
Computational thinking involves solving problems, designing systems, and understanding human behavior, by drawing on the concepts fundamental to computer science. Computational thinking includes a range of mental tools that reflect the breadth of the field of computer science.

from Jeannette M. Wing's Computational Thinking Manifesto

The Center for Computational Thinking at CMU has more information about the subject.

We talked briefly about Computational Thinking back in 2006. Recently I listened to Jon Udell's interview with Jeannette Wing and realized that it's time to bring this subject up again for discussion.

Linear Logical Algorithms

Linear Logical Algorithms, Robert J. Simmons and Frank Pfenning, 2008.

Bottom-up logic programming can be used to declaratively specify many algorithms in a succinct and natural way, and McAllester and Ganzinger have shown that it is possible to define a cost semantics that enables reasoning about the running time of algorithms written as inference rules. Previous work with the programming language Lollimon demonstrates the expressive power of logic programming with linear logic in describing algorithms that have imperative elements or that must repeatedly make mutually exclusive choices. In this paper, we identify a bottom-up logic programming language based on linear logic that is amenable to efficient execution and describe a novel cost semantics that can be used for complexity analysis of algorithms expressed in linear logic.

In my last post, I linked to a paper by Ganzinger and McAllester about specifying algorithms as logic programs, and a) admired how concise and natural the programs were, and b) was sad that the logic programs used some "non-logical" operations, such as deletion.

So, what does it mean for an operation to be "non-logical", and why is it a bad thing? Roughly speaking, you can think of the analogy: non-logical operations are to logic programs what impure operations are to functional programs -- they are features which break some parts of the equational theory of the language. Now, the Curry-Howard correspondence for functional programs says that types are propositions, and programs are proofs. It turns out that a different version of this correspondence holds for logic programs: in logic programming, a set of propositions is a program, and the execution of a program corresponds to a process of proof search -- you get a success when execution finds a proof of the goal.

When you have nonlogical operations in your logic programming language, you've introduced operators that don't correspond to valid rules of inference, so even if your logic program succeeds, the success might not correspond to a real proof. Deletion of facts from a database is a good example of a nonlogical operation. Regular intuitionistic and classical logic is monotonic: deduction from premises can only learn new facts, it can never disprove things you already knew to be true. Since deletion removes facts from the set of things you know, it can't have a logical interpretation in classical/intuitionistic logic.

However, it turns out that not all logics are monotonic, and in this paper Simmons and Pfenning show that if you take the language of propositions to be a fragment of linear logic, then all of the operations that Ganzinger and McAllester use actually do have a nice logical interpretation.
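
To see the difference concretely, here is a toy multiset-rewriting loop in Python (my own sketch, not the paper's language): where ordinary forward chaining only ever adds facts, a linear rule consumes the facts it matches -- precisely the deletion behaviour that linear logic turns back into an ordinary inference.

    from collections import Counter

    def apply_linear_rule(state, consumes, produces):
        # One step of multiset rewriting: if the premises are present, consume
        # them and add the conclusions. In linear logic this is a legitimate
        # inference; in classical/intuitionistic logic the "consume" part has
        # no logical reading.
        if all(state[f] >= n for f, n in consumes.items()):
            return state - Counter(consumes) + Counter(produces), True
        return state, False

    # Linear rule: nickel ⊗ nickel ⊸ dime   (two nickels consumed, one dime produced)
    state = Counter({"nickel": 5})
    changed = True
    while changed:
        state, changed = apply_linear_rule(state, {"nickel": 2}, {"dime": 1})

    print(dict(state))   # {'nickel': 1, 'dime': 2}

Run to saturation, the database ends up with facts removed as well as added, yet every step corresponds to a rule of (linear) inference -- which is what lets the logical reading and the cost semantics line up.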

Processing.js

John Resig (of jQuery fame) has ported the Processing visualization language to JavaScript.

The examples are remarkable; check them out (but note the browser issues John discusses).

John has a little confession:

The first portion of the project was writing a parser to dynamically convert code written in the Processing language, to JavaScript. This involves a lot of gnarly regular expressions chewing up the code, spitting it out in a format that the browser understands.

It works "fairly well" (in that it's able to handle anything that the processing.org web site throws at it) but I'm sure its total scope is limited (until a proper parser is involved). I felt bad about tackling this using regular expressions until I found out that the original Processing code base did it in the same manner (they now use a real parser, naturally).

Actually that's quite cool in itself (even if angels weep at this parsing code, I think we on LtU shouldn't cast the first stone). DSLs should be easily built and played with. Cleaning up the implementation comes later, if at all.

Purists may not only object to the regular expression parsing, but also to the central line of code which ties things together, namely: eval(parse(code, p)). But then, DSL lovers are not the sort of people to object to eval...
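
To make the move concrete, here is a deliberately tiny, hypothetical analogue in Python rather than JavaScript (nothing like Resig's actual parser): regular expressions translate a toy drawing DSL into host-language calls, and the host's eval/exec ties it together.

    import re

    def translate(src):
        # Crude, regex-based translation of a toy "line x1 y1 x2 y2" DSL into
        # Python calls -- the same spirit as regex-translating Processing
        # source into JavaScript, and just as fragile.
        out = []
        for ln in src.strip().splitlines():
            m = re.match(r"line\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)", ln.strip())
            if m:
                out.append("draw_line(%s, %s, %s, %s)" % m.groups())
        return "\n".join(out)

    def draw_line(x1, y1, x2, y2):
        print(f"line from ({x1},{y1}) to ({x2},{y2})")

    program = """
    line 0 0 100 100
    line 100 0 0 100
    """

    exec(translate(program))   # the eval(parse(code, p)) move, in miniature

Anything the one pattern doesn't recognize is silently dropped, which is roughly where "its total scope is limited" comes from; swapping in a real parser changes translate(), not the eval step.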

In the old days of LtU we regularly posted links to cool small interpreters that people could play with. Some of the more amusing ones were javascript based, and the page contained a REPL form (Luke, I am talking to you!). It is a shame we don't post more stuff like this, in between the more highbrow discussions...

Logical Algorithms

Logical Algorithms, Harald Ganzinger and David McAllester. ICALP 2002.

It is widely accepted that many algorithms can be concisely and clearly expressed as logical inference rules. However, logic programming has been inappropriate for the study of the running time of algorithms because there has not been a clear and precise model of the run time of a logic program. We present a logic programming model of computation appropriate for the study of the run time of a wide variety of algorithms.

So, there are two main styles in logic programming. The first is Prolog-style goal-directed, or backwards, search. The idea is that you have a set of rules and a goal, and you nondeterministically choose rules that could have derived that goal, trying to find a sequence of deductions that proves it. It's called backwards search since you are reasoning backwards from the goal towards a full proof.

The other style is, naturally, called forwards search (confusingly, this is also called the inverse method in theorem proving). The idea is that you have a goal, some rules, and a starting set of facts. You then apply the rules to the facts you have, enlarging your database of facts and enabling more deductions. You keep doing this until either you discover the goal you were trying to prove in the database of facts, or the database saturates (i.e., no new facts can be derived) and the goal is unprovable. The idea is that your database is an implicit data structure, which you update as part of the search. This makes forwards search a particularly natural method when you're trying to compute closures -- graph algorithms, dataflow analyses, that kind of thing.
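
A small example of what that looks like in practice (my Python, not the paper's inference-rule notation): computing a graph's transitive closure by saturating a database of path facts under two rules.

    def transitive_closure(edges):
        # Bottom-up (forward) logic programming in miniature.
        # Rules:  path(X, Y) :- edge(X, Y).
        #         path(X, Z) :- edge(X, Y), path(Y, Z).
        # Apply the rules until the fact database saturates.
        facts = set(edges)                     # edge facts seed the path facts
        changed = True
        while changed:
            changed = False
            derived = {(x, z) for (x, y) in edges for (w, z) in facts if y == w}
            if not derived <= facts:
                facts |= derived
                changed = True
        return facts

    print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
    # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

The cost semantics in the paper is, roughly, about counting how many rule instances like these can ever fire ("prefix firings"), so that a naive recompute-everything loop like the one above can be replaced by, and accounted as, an efficient incremental one.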

While we've discussed applications of forward logic programming before, I thought it might be good to link to a discussion of the methodology of how to specify algorithms as forward logic programs and analyze their complexity.

The language in this paper permits deletion of facts from the database, which is unfortunately a non-logical operation -- in classical and intuitionistic logic, deduction can only increase the number of facts you know. But with a change of logic, it can be made logical. That'll be the next paper I post a link to. :)

HOPL-III: A History of Erlang

A History of Erlang and the accompanying Presentation Slides by Joe Armstrong are a must read for anyone interested in PL history.

Erlang was designed for writing concurrent programs that "run forever". Erlang uses concurrent processes to structure the program. These processes have no shared memory and communicate by asynchronous message passing. Erlang processes are lightweight and belong to the language and not the operating system. Erlang has mechanisms to allow programs to change code "on the fly" so that programs can evolve and change as they run. These mechanisms simplify the construction of software for implementing non-stop systems.

(Link to previous HOPL-III papers on LtU).
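
For readers who have not seen the model, here is a rough Python approximation of the programming style the abstract describes (threads and queues standing in for Erlang's far lighter processes and mailboxes, and with none of Erlang's fault tolerance or hot code loading): state lives inside a process, and the outside world interacts with it only by asynchronous messages.

    import threading, queue

    def counter(mailbox):
        # A "process" that owns its state and is driven entirely by messages.
        count = 0
        while True:
            msg = mailbox.get()        # receive: blocks until a message arrives
            if msg == "stop":
                print("final count:", count)
                return
            count += msg               # state is local; nothing is shared

    mailbox = queue.Queue()            # the process's mailbox
    threading.Thread(target=counter, args=(mailbox,)).start()

    mailbox.put(1)                     # sends are asynchronous: the sender never waits
    mailbox.put(2)
    mailbox.put("stop")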

Lambda, the Ultimate TA

Benjamin C. Pierce. Using a Proof Assistant to Teach Programming Language Foundations, or, Lambda, the Ultimate TA. April 2008. White paper.

In Fall 2007, I taught an introductory course on logic and the theory of programming languages entirely in Coq. The experience was quite demanding — for the students and especially for me! — but the overall payoff in terms of student engagement and performance on exams far exceeded my hopes. I am now convinced that this is the right way to teach programming language foundations and am working on course materials that will allow the approach to be replicated elsewhere.

Just this weekend I talked with someone about E-Learning, and he mentioned "immediate nonjudgmental feedback" as one of the main advantages of using computers in the classroom. In the context of teaching logic and the art of constructing proofs, this boils down to using proof assistants as "personal TAs". This is what Pierce tried to do (Paul will be pleased: Coq was the proof assistant Pierce chose), and this note describes his experiences and conclusions.

The teaching materials are available online, if you prefer to look at them and form your own opinion before reading Pierce's conclusions...
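
To give a flavour of what "immediate nonjudgmental feedback" means in this setting, here is a toy exercise of the kind such a course might set -- written in Lean purely for illustration, and not taken from Pierce's Coq materials:

    -- The student states a lemma and supplies a proof term;
    -- the proof assistant checks it on the spot.
    theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
      fun h => ⟨h.right, h.left⟩

    -- A wrong attempt, say ⟨h.left, h.right⟩, is rejected immediately with a
    -- type error pointing at the offending component -- no waiting for a human TA.

Whether the proof is accepted or rejected, the feedback is instant and impartial, which is exactly the "personal TA" role described above.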